On Improved Example-based Search in Digital Libraries via Term Ranking

Authors

  • SULIEMAN BANI-AHMAD
  • GHADEER AL-DWEIK
Abstract

Example-based searching, where a user provides an example publication in order to locate similar publications, is becoming commonplace in literature digital libraries. Two approaches to estimating similarity between publications are (i) graph-based approaches, where citation relationships among publications are used to compute similarities, and (ii) text-based approaches, where shared terms between publications are used as an indicator of similarity. In this paper we introduce a new text-based publication-similarity measure that enhances existing example-based searching by utilizing term importance information. Term importance is computed via a proposed graph-based term ranking (GBTR) algorithm. The GBTR algorithm differs from previous term ranking approaches in that it recursively computes a term's importance from the entire publication in which the term is observed, rather than relying only on local information. GBTR works well when paired with Okapi BM25. We exhaustively evaluate the performance of GBTR and compare it against existing term-ranking methods such as the Chronological Term Rank (CTR) and Term Proximity (TP) models. Significant improvements in precision over existing approaches are observed: GBTR achieved around 10% improvement in precision over CTR and around 2% over TP, with much lower time and space complexity than the TP approach.
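The abstract does not say how GBTR term importance is combined with Okapi BM25, so the following is only a minimal sketch of one plausible pairing, not the authors' method. It implements standard Okapi BM25 scoring with the example publication's terms as the query, and multiplies each term's contribution by an optional importance weight. The function name `bm25_scores`, the `term_weight` mapping, and the toy documents are all hypothetical; the weights stand in for whatever a term-ranking algorithm such as GBTR would produce.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, term_weight=None, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` (Okapi BM25).

    `term_weight` is a hypothetical mapping term -> importance multiplier.
    It is an assumption for illustration; the abstract does not define how
    GBTR scores enter the ranking function.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in set(query_terms):
            if tf[t] == 0:
                continue
            # Non-negative IDF variant (the "+ 1" inside the log).
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avg_len))
            weight = 1.0 if term_weight is None else term_weight.get(t, 1.0)
            score += weight * idf * norm
        scores.append(score)
    return scores


# Toy corpus: each publication is a list of tokens (hypothetical data).
docs = [
    ["graph", "term", "ranking", "search"],
    ["citation", "graph", "similarity"],
    ["protein", "folding", "simulation"],
]
example_publication = ["term", "ranking", "graph"]

plain = bm25_scores(example_publication, docs)
weighted = bm25_scores(example_publication, docs, term_weight={"graph": 5.0})
```

Here the example publication itself plays the role of the query, which is the essence of example-based search. Raising the weight of a term such as "graph" increases the score of every publication that contains it, which is how term importance could shift the ranking.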

Similar resources

Using Interactive Search Elements in Digital Libraries

Background and Aim: Interaction in a digital library helps users locate and access information, and also assists them in creating knowledge, better perception, problem solving and recognition of dimension of resources. This paper tries to identify and introduce the components and elements that are used in interaction between user and system in search and retrieval of information in digital li...

A Plugin Architecture Enabling Federated Search for Digital Libraries

Today, users expect a variety of digital libraries to be searchable from a single Web page. The German Vascoda project provides this service for dozens of information sources. Its ultimate goal is to provide search quality close to the ranking of a central database containing documents from all participating libraries. Currently, however, the Vascoda portal is based on a non-cooperative metasea...

A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine

Background and Aim: The purpose of this study was to compare the impact of text-based indexing and folksonomy on image retrieval via the Google search engine. Methods: This study used an experimental method. The sample is 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...

Reducing semantic complexity in distributed digital libraries

Purpose: The general science portal "vascoda" merges structured, high-quality information collections from more than 40 providers on the basis of search engine technology (FAST) and a concept which treats semantic heterogeneity between different controlled vocabularies. First experiences with the portal show some weaknesses of this approach which come out in most metadata-driven Digital Libr...

Federated Search of Text-Based Digital Libraries in Hierarchical Peer-to-Peer Networks

Peer-to-peer architectures are a potentially powerful model for developing large-scale networks of text-based digital libraries, but peer-to-peer networks have so far provided very limited support for text-based federated search of digital libraries using relevance-based ranking. This paper addresses the problems of resource representation, resource ranking and selection, and result merging for ...

Publication date: 2010